Pattern Matching In The Textract Information Extraction System

Authors

  • Tsuyoshi Kitani
  • Yoshio Eriguchi
  • Masami Hara
Abstract

In information extraction systems, pattern matchers are widely used to identify information of interest in a sentence. In this paper, pattern matching in the TEXTRACT information extraction system is described. It comprises a concept search which identifies key words representing a concept, and a template pattern search which identifies patterns of words and phrases. TEXTRACT using the matcher performed well in the TIPSTER/MUC-5 evaluation. The pattern matching architecture is also suitable for rapid system development across different domains of the same language.
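
The two stages named in the abstract can be illustrated with a minimal sketch. It is not the TEXTRACT implementation: the concept dictionary, the key words, the template, and the function names `concept_search` and `template_search` are all hypothetical, and English words are used purely for readability. The sketch first replaces key words with concept labels, then looks for a pattern over the labelled sentence.

```python
import re

# Hypothetical concept dictionary: concept label -> key words (invented for illustration).
CONCEPTS = {
    "COMPANY": {"acme", "globex"},
    "TIE_UP": {"joint", "venture", "partnership"},
}

# Hypothetical template: a company, optional words, a tie-up key word,
# optional words, then another company.
TEMPLATE = re.compile(r"\bCOMPANY( \S+)*? TIE_UP( \S+)*? COMPANY\b")


def concept_search(tokens):
    """Replace each key word with its concept label; lowercase all other tokens."""
    tagged = []
    for token in tokens:
        word = token.lower()
        label = next((c for c, words in CONCEPTS.items() if word in words), None)
        tagged.append(label if label else word)
    return tagged


def template_search(tagged):
    """Return True if the concept-tagged sentence contains the template pattern."""
    return TEMPLATE.search(" ".join(tagged)) is not None


tokens = re.findall(r"\w+", "Acme formed a joint venture with Globex")
tagged = concept_search(tokens)
found = template_search(tagged)
```

Keeping the key-word lists and templates as data separate from the matching functions is one way to read the abstract's claim about rapid development across domains: under this assumption, moving to a new domain means editing `CONCEPTS` and `TEMPLATE`, not the matcher itself.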

Similar resources

NTT Data: Description of the Erie System Used for MUC-6

Erie is a name recognition system developed for the Multilingual Entity Task (MET) in MUC-6. The pattern matching engine recognizes organization, person, and place names along with time and numeric expressions in Japanese text. Although our previous information extraction system Textract performed well in MUC-5, the pattern matching engine, which was written in AWK language, was slow[2]. System...

Information extraction and evaluation

This topic session focussed on a variety of issues in evaluation (three presentations) and extraction (one presentation). For the extraction presentation, Tsuyoshi Kitani, a visiting researcher at the Center for Machine Translation at Carnegie-Mellon University gave a presentation entitled "Overview of TEXTRACT Template-Filling Solutions". This talk gave an overview of TEXTRACT, which proces...

Local Derivative Pattern with Smart Thresholding: Local Composition Derivative Pattern for Palmprint Matching

Palmprint recognition is a new biometrics system based on physiological characteristics of the palmprint, which includes rich, stable, and unique features such as lines, points, and texture. Texture is one of the most important features extracted from low resolution images. In this paper, a new local descriptor, Local Composition Derivative Pattern (LCDP) is proposed to extract smartly stronger...

A Question Answering System Supported by Information Extraction

This paper discusses an information extraction (IE) system, Textract, in natural language (NL) question answering (QA) and examines the role of IE in QA application. It shows: (i) Named Entity tagging is an important component for QA, (ii) an NL shallow parser provides a structural basis for questions, and (iii) high-level domain independent IE can result in a QA breakthrough.

Matching Scores of System Relevance and User-Oriented Relevance in SID, ISC and Google Scholar

Background and Aim: The main aim of information storage and retrieval systems is to keep and retrieve related information, that is, to provide documents relevant to users' needs or requests. This study aimed to determine how closely system relevance and user-oriented relevance match in the SID, SCI and Google Scholar databases. Method: In this study 15 keywords of ...

Publication date: 1994